NSF PAR Search | NSF Public Access Repository


Neuro-Symbolic Concepts

Mao, Jiayuan; Tenenbaum, Joshua B; Wu, Jiajun (July 2025, Communications of the ACM)

Free, publicly-accessible full text available July 1, 2026
One-shot manipulation strategy learning by making contact analogies.

Liu, Yuyao; Mao, Jiayuan; Tenenbaum, Joshua; Lozano-Perez, Tomas; Kaelbling, Leslie (June 2025, Proceedings IEEE International Conference on Robotics and Automation)

We present a novel approach, MAGIC (manipulation analogies for generalizable intelligent contacts), for one-shot learning of manipulation strategies with fast and extensive generalization to novel objects. By leveraging a reference action trajectory, MAGIC effectively identifies similar contact points and sequences of actions on novel objects to replicate a demonstrated strategy, such as using different hooks to retrieve distant objects of different shapes and sizes. Our method is based on a two-stage contact-point matching process that combines global shape matching using pretrained neural features with local curvature analysis to ensure precise and physically plausible contact points. We experiment with three tasks including scooping, hanging, and hooking objects. MAGIC demonstrates superior performance over existing methods, achieving significant improvements in runtime speed and generalization to different object categories. Website: https://magic-2024.github.io/.
Free, publicly-accessible full text available June 2, 2026
Keypoint abstraction using large models for object-relative imitation learning.

Fang, Xiaolin; Huang, Bo-Ruei; Mao, Jiayuan; Shone, Jasmine; Tenenbaum, Joshua; Lozano-Perez, Tomas; Kaelbling, Leslie (June 2025, Proceedings IEEE International Conference on Robotics and Automation)

Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics. Keypoint-based representations have proven effective as a succinct representation for capturing essential object features and for establishing a reference frame in action prediction, enabling data-efficient learning of robot skills. However, their reliance on manual design and additional human labels limits their scalability. In this paper, we propose KALM, a framework that leverages large pre-trained vision-language models (LMs) to automatically generate task-relevant and cross-instance consistent keypoints. KALM distills robust and consistent keypoints across views and objects by generating proposals using LMs and verifying them against a small set of robot demonstration data. Based on the generated keypoints, we can train keypoint-conditioned policy models that predict actions in keypoint-centric frames, enabling robots to generalize effectively across varying object poses, camera views, and object instances with similar functional shapes. Our method demonstrates strong performance in the real world, adapting to different tasks and environments from only a handful of demonstrations while requiring no additional labels.
Free, publicly-accessible full text available June 2, 2026
What Makes a Maze Look Like a Maze?

Hsu, Joy; Mao, Jiayuan; Tenenbaum, Joshua B; Goodman, Noah D; Wu, Jiajun (May 2025, International Conference on Learning Representations (ICLR))

Free, publicly-accessible full text available May 1, 2026
Hybrid declarative-imperative representations for hybrid discrete-continuous decision-making

Mao, Jiayuan; Tenenbaum, Joshua; Lozano-Perez, Tomas; Kaelbling, Leslie (October 2024, Workshop on Algorithmic Foundations of Robotics)

We present a robot-behavior description language cdl that can express both direct imperative strategies and planning-based strategies, and combine them seamlessly within the same program. Accompanying this language is a general-purpose planner Crow, which interprets the behavior description and searches as necessary to find a sound plan. We demonstrate (1) via example programs, that cdl can be used to specify, very intuitively, different known strategies for navigation among movable obstacles (NAMO) problems, (2) via empirical results, that Crow can take advantage of the priors expressed in cdl to very quickly solve problem instances with known simplifying structure but still generalize to hard instances, and (3) via theory, that width, a powerful characterization of the worst-case complexity of planning problems, corresponds to a natural property of cdl descriptions and that Crow operates in time on the same order as the width-based worst-case complexity.
Full Text Available
Learning Planning Abstractions from Language

Liu, Weiyu; Chen, Geng; Hsu, Joy; Mao, Jiayuan; Wu, Jiajun (May 2024, International Conference on Learning Representations (ICLR))

Full Text Available
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

Li, Manling; Zhao, Shiyu; Wang, Qineng; Wang, Kangrui; Zhou, Yu; Srivastava, Sanjana; Gokmen, Cem; Lee, Tony; Li, Li Erran; Zhang, Ruohan; et al (December 2024, Advances in Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS Datasets and Benchmarks))

Full Text Available
Learning adaptive planning representations with natural language guidance

Wong, Lionel; Mao, Jiayuan; Sharma, Pratyusha; Siegel, Zachary; Feng, Jiahai; Korneev, Noa; Tenenbaum, Joshua B; Andreas, Jacob (May 2024, International Conference on Learning Representations)

Effective planning in the real world requires not only world knowledge, but the ability to leverage that knowledge to build the right representation of the task at hand. Decades of hierarchical planning techniques have used domain-specific temporal action abstractions to support efficient and accurate planning, almost always relying on human priors and domain knowledge to decompose hard tasks into smaller subproblems appropriate for a goal or set of goals. This paper describes Ada (Action Domain Acquisition), a framework for automatically constructing task-specific planning representations using task-general background knowledge from language models (LMs). Starting with a general-purpose hierarchical planner and a low-level goal-conditioned policy, Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks. On two language-guided interactive planning benchmarks (Mini Minecraft and ALFRED Household Tasks), Ada strongly outperforms other approaches that use LMs for sequential decision-making, offering more accurate plans and better generalization to complex tasks.
Full Text Available
What Planning Problems Can A Relational Neural Network Solve?

Mao, Jiayuan; Lozano-Perez, Tomas; Tenenbaum, Joshua; Kaelbling, Leslie (December 2023, Advances in neural information processing systems (NeurIPS) 2023)

Goal-conditioned policies are generally understood to be “feed-forward” circuits, in the form of neural networks that map from the current state and the goal specification to the next action to take. However, under what circumstances such a policy can be learned and how efficient the policy will be are not well understood. In this paper, we present a circuit complexity analysis for relational neural networks (such as graph neural networks and transformers) representing policies for planning problems, by drawing connections with serialized goal regression search (S-GRS). We show that there are three general classes of planning problems, in terms of the growth of circuit width and depth as a function of the number of objects and planning horizon, providing constructive proofs. We also illustrate the utility of this analysis for designing neural networks for policy learning.
Full Text Available
Learning Reusable Manipulation Strategies

Mao, Jiayuan; Tenenbaum, Joshua; Lozano-Perez, Tomas; Kaelbling, Leslie (November 2023, Proceedings of Machine Learning Research: Conference on Robot Learning (CoRL) 2023)

Humans demonstrate an impressive ability to acquire and generalize manipulation “tricks.” Even from a single demonstration, such as using soup ladles to reach for distant objects, we can apply this skill to new scenarios involving different object positions, sizes, and categories (e.g., forks and hammers). Additionally, we can flexibly combine various skills to devise long-term plans. In this paper, we present a framework that enables machines to acquire such manipulation skills, referred to as “mechanisms,” through a single demonstration and self-play. Our key insight lies in interpreting each demonstration as a sequence of changes in robot-object and object-object contact modes, which provides a scaffold for learning detailed samplers for continuous parameters. These learned mechanisms and samplers can be seamlessly integrated into standard task and motion planners, enabling their compositional use.
Full Text Available
